MLP Internal Representation as Discriminative Features for Improved Speaker Recognition

نویسندگان

  • Dalei Wu
  • Andrew C. Morris
  • Jacques C. Koreman
چکیده

Feature projection by non-linear discriminant analysis (NLDA) can substantially increase classification performance. In automatic speech recognition (ASR) the projection provided by the pre-squashed outputs from a one hidden layer multi-layer perceptron (MLP) trained to recognise speech subunits (phonemes) has previously been shown to significantly increase ASR performance. An analogous approach cannot be applied directly to speaker recognition because there is no recognised set of "speaker sub-units" to provide a finite set of MLP target classes, and for many applications it is not practical to train an MLP with one output for each target speaker. In this paper we show that the output from the second hidden layer (compression layer) of an MLP with three hidden layers trained to identify a subset of 100 speakers selected at random from a set of 300 training speakers in Timit, can provide a 77% relative error reduction for common Gaussian mixture model (GMM) based speaker identification.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Augmenting short-term cepstral features with long-term discriminative features for speaker verification of telephone data

Short-term cepstral features have long been chosen as standard features for speaker recognition thanks to their relevance and effectiveness. In contrast, discriminative features, calculated by a multi-layer perceptron (MLP) from much longer stretches of time, have been gradually adopted in automatic speech recognition (ASR). It has been shown that augmenting short-term cepstral features with lo...

متن کامل

Integrating MLP features and discriminative training in data sampling based ensemble acoustic modeling

In this paper, we propose to incorporate the widely used Multiple Layer Perceptron (MLP) features and discriminative training (DT) into our recent data-sampling based ensemble acoustic models to further improve the quality of the individual models as well as the diversity among the models. We also propose applying speaker-model distance based speaker clustering for data sampling to construct en...

متن کامل

Integration of Face and Voice Recognition

cepstral features and features based on a bio–mechanical model of the visible articulators will be the identity–carrying characteristics extracted from acoustic speech and visual speech respectively. Speakers will be modelled by multi–layer perceptrons trained as discriminative models or, alternatively, as predictive models. In the discriminative modelling scheme, each speaker model will be tra...

متن کامل

Robustness to telephone handset distortion in speaker recognition by discriminative feature design

A method is described for designing speaker recognition features that are robust to telephone handset distortion. The approach transforms features such as mel-cepstral features, log spectrum, and prosody-based features with a non-linear arti®cial neural network. The neural network is discriminatively trained to maximize speaker recognition performance speci®cally in the setting of telephone han...

متن کامل

On speaker adaptive training of artificial neural networks

In the paper we present two techniques improving the recognition accuracy of multilayer perceptron neural networks (MLP ANN) by means of adopting Speaker Adaptive Training. The use of the MLP ANN, usually in combination with the TRAPS parametrization, includes applications in speech recognition tasks, discriminative features production for GMM-HMM and other. In the first SAT experiments, we use...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2005